49 research outputs found

    SuperSpike: Supervised learning in multi-layer spiking neural networks

    A vast majority of computation in the brain is performed by spiking neural networks. Despite the ubiquity of such spiking, we currently lack an understanding of how biological spiking neural circuits learn and compute in-vivo, as well as how we can instantiate such capabilities in artificial spiking circuits in-silico. Here we revisit the problem of supervised learning in temporally coding multi-layer spiking neural networks. First, by using a surrogate gradient approach, we derive SuperSpike, a nonlinear voltage-based three factor learning rule capable of training multi-layer networks of deterministic integrate-and-fire neurons to perform nonlinear computations on spatiotemporal spike patterns. Second, inspired by recent results on feedback alignment, we compare the performance of our learning rule under different credit assignment strategies for propagating output errors to hidden units. Specifically, we test uniform, symmetric and random feedback, finding that simpler tasks can be solved with any type of feedback, while more complex tasks require symmetric feedback. In summary, our results open the door to obtaining a better scientific understanding of learning and computation in spiking neural networks by advancing our ability to train them to solve nonlinear problems involving transformations between different spatiotemporal spike-time patterns

    Improving equilibrium propagation without weight symmetry through Jacobian homeostasis

    Equilibrium propagation (EP) is a compelling alternative to the backpropagation of error algorithm (BP) for computing gradients of neural networks on biological or analog neuromorphic substrates. Still, the algorithm requires weight symmetry and infinitesimal equilibrium perturbations, i.e., nudges, to estimate unbiased gradients efficiently. Both requirements are challenging to implement in physical systems. Yet, whether and how weight asymmetry affects its applicability is unknown because, in practice, it may be masked by biases introduced through the finite nudge. To address this question, we study generalized EP, which can be formulated without weight symmetry, and analytically isolate the two sources of bias. For complex-differentiable non-symmetric networks, we show that the finite nudge does not pose a problem, as exact derivatives can still be estimated via a Cauchy integral. In contrast, weight asymmetry introduces bias resulting in low task performance due to poor alignment of EP's neuronal error vectors compared to BP. To mitigate this issue, we present a new homeostatic objective that directly penalizes functional asymmetries of the Jacobian at the network's fixed point. This homeostatic objective dramatically improves the network's ability to solve complex tasks such as ImageNet 32x32. Our results lay the theoretical groundwork for studying and mitigating the adverse effects of imperfections of physical networks on learning algorithms that rely on the substrate's relaxation dynamics
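The claim that a finite nudge need not bias the estimate for complex-differentiable systems can be illustrated on a scalar toy problem. The sketch below is not the paper's network-level procedure. It only shows the underlying identity: for a holomorphic function, averaging finite-size complex perturbations on a circle (a discretized Cauchy integral) recovers the derivative far more accurately than a one-sided finite difference of the same size. The function names, radius, and number of points are illustrative assumptions.

```python
import cmath
import math


def cauchy_derivative(f, x0, radius=0.5, n_points=16):
    """Estimate f'(x0) for a holomorphic f using finite complex perturbations.

    Discretizes f'(x0) = 1/(2*pi*i) * contour integral of f(z)/(z - x0)**2 dz
    on a circle of the given radius around x0 (hypothetical helper).
    """
    total = 0j
    for k in range(n_points):
        phase = cmath.exp(2j * math.pi * k / n_points)
        # Finite complex perturbation of size `radius`, weighted by exp(-i*theta).
        total += f(x0 + radius * phase) / phase
    return total / (n_points * radius)


def one_sided_difference(f, x0, step=0.5):
    """Real one-sided finite difference with the same perturbation size."""
    return (f(x0 + step) - f(x0)) / step


x0 = 0.3
exact = 1.0 - math.tanh(x0) ** 2  # analytic derivative of tanh
cauchy_error = abs(cauchy_derivative(cmath.tanh, x0).real - exact)
finite_diff_error = abs(one_sided_difference(cmath.tanh, x0).real - exact)
print("Cauchy estimate error:", cauchy_error)
print("one-sided difference error:", finite_diff_error)
```

The circle must stay inside the region where the function is holomorphic. For tanh the nearest poles lie at a distance of about 1.6 from x0 = 0.3, so a radius of 0.5 is safe.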

    The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks

    Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. In comparison, the functional capabilities of models of spiking networks are still rudimentary. This shortcoming is mainly due to the lack of insight and practical algorithms to construct the necessary connectivity. Any such algorithm typically attempts to build networks by iteratively reducing the error compared to a desired output. But assigning credit to hidden units in multi-layered spiking networks has remained challenging due to the non-differentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity in spiking network models. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients impact learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative’s scale can substantially affect learning performance. When we combine surrogate gradients with a suitable activity regularization technique, robust information processing can be achieved in spiking networks even at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks
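To make the distinction between the shape and the scale of a surrogate derivative concrete, here is a minimal sketch. It assumes two commonly used surrogate shapes (a fast-sigmoid derivative and a logistic-sigmoid derivative, both up to a constant factor) and a steepness parameter `beta` that plays the role of the scale. The function names, the threshold of 1.0, and the values of `beta` are illustrative assumptions, not the settings used in the study.

```python
import math


def spike(u, threshold=1.0):
    """Forward pass: non-differentiable Heaviside spike nonlinearity."""
    return 1.0 if u >= threshold else 0.0


def fast_sigmoid_surrogate(u, threshold=1.0, beta=10.0):
    """Surrogate shaped like the derivative of a fast sigmoid (peak value 1)."""
    return 1.0 / (1.0 + beta * abs(u - threshold)) ** 2


def sigmoid_surrogate(u, threshold=1.0, beta=10.0):
    """Surrogate shaped like the derivative of a logistic sigmoid (peak value 0.25)."""
    s = 1.0 / (1.0 + math.exp(-beta * (u - threshold)))
    return s * (1.0 - s)


# The backward pass replaces the ill-defined spike derivative with a surrogate.
# A larger beta narrows the window around the threshold that passes gradient.
u = 0.5  # membrane potential below threshold: no spike in the forward pass
for beta in (1.0, 10.0, 100.0):
    print(beta, spike(u), fast_sigmoid_surrogate(u, beta=beta),
          sigmoid_surrogate(u, beta=beta))
```

Both shapes peak at the threshold and decay away from it. Changing `beta` changes how much gradient reaches neurons whose membrane potential is far from threshold, which is one intuition for why the scale could matter more than the exact shape.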

    Memory formation and recall in recurrent spiking neural networks

    Our brain has the capacity to analyze a visual scene in a split second, to learn how to play an instrument, and to remember events, faces and concepts. Neurons underlie all of these diverse functions. Neurons, cells within the brain that generate and transmit electrical activity, communicate with each other through chemical synapses. These synaptic connections dynamically change with experience, a process referred to as synaptic plasticity, which is thought to be at the core of the brain's ability to learn and process the world in sophisticated ways. Our understanding of the rules of synaptic plasticity remains quite limited. To enable efficient computations among neurons or to serve as a trace of memory, synapses must create stable connectivity patterns between neurons. However, there remains an insufficient theoretical explanation as to how stable connectivity patterns can form in the presence of synaptic plasticity. Since the dynamics of recurrently connected neurons depend upon their connections, which themselves change in response to the network dynamics, synaptic plasticity and network dynamics have to be treated as a compound system. Due to the nonlinear nature of the system, this can be analytically challenging. Utilizing network simulations that model the interplay between the network connectivity and synaptic plasticity can provide valuable insights. However, many existing network models that implement biologically relevant forms of plasticity become unstable. This suggests that current models do not accurately describe the biological networks, which have no difficulty functioning without succumbing to exploding network activity. The instability in these network simulations could originate from the fact that theoretical studies have, almost exclusively, focused on Hebbian plasticity at excitatory synapses. Hebbian plasticity causes connected neurons that are active together to increase the connection strength between them.
    Biological networks, however, display a large variety of different forms of synaptic plasticity and homeostatic mechanisms, beyond Hebbian plasticity. Furthermore, inhibitory cells can undergo synaptic plasticity as well. These diverse forms of plasticity are active at the same time, and our understanding of the computational role of most of these synaptic dynamics remains elusive. This raises the important question as to whether forms of plasticity that have not been previously considered could, in combination with Hebbian plasticity, lead to stable network dynamics. Here we illustrate that by combining multiple forms of plasticity with distinct roles, a recurrently connected spiking network model self-organizes to distinguish and extract multiple overlapping external stimuli. Moreover, we show that the acquired network structures remain stable over hours while plasticity is active. This long-term stability allows the network to function as an associative memory and to correctly classify distorted or partially cued stimuli. During intervals in which no stimulus is shown, the network dynamically remembers the last stimulus as selective delay activity. Taken together, this work suggests that multiple forms of plasticity and homeostasis on different timescales have to work together to create stable connectivity patterns in neuronal networks which enable them to perform relevant computation.

    Predictor networks and stop-grads provide implicit variance regularization in BYOL/SimSiam

    Self-supervised learning (SSL) learns useful representations from unlabelled data by training networks to be invariant to pairs of augmented versions of the same input. Non-contrastive methods avoid collapse either by directly regularizing the covariance matrix of network outputs or through asymmetric loss architectures, two seemingly unrelated approaches. Here, by building on DirectPred, we lay out a theoretical framework that reconciles these two views. We derive analytical expressions for the representational learning dynamics in linear networks. By expressing them in the eigenspace of the embedding covariance matrix, where the solutions decouple, we reveal the mechanism and conditions that provide implicit variance regularization. These insights allow us to formulate a new isotropic loss function that equalizes eigenvalue contribution and renders learning more robust. Finally, we show empirically that our findings translate to nonlinear networks trained on CIFAR-10 and STL-10

    Surrogate Gradient Learning in Spiking Neural Networks

    Spiking neural networks are nature's versatile solution to fault-tolerant and energy efficient signal processing. To translate these benefits into hardware, a growing number of neuromorphic spiking neural network processors attempt to emulate biological neural networks. These developments have created an imminent need for methods and tools to enable such systems to solve real-world signal processing problems. Like conventional neural networks, spiking neural networks can be trained on real, domain specific data. However, their training requires overcoming a number of challenges linked to their binary and dynamical nature. This article elucidates step-by-step the problems typically encountered when training spiking neural networks, and guides the reader through the key concepts of synaptic plasticity and data-driven learning in the spiking setting. To that end, it gives an overview of existing approaches and provides an introduction to surrogate gradient methods, specifically, as a particularly flexible and efficient method to overcome the aforementioned challenges